Array Based Dynamic Graph #135

plofgren · 2015-01-27T00:02:15Z

I've implemented a dynamic directed graph, using an ArrayBuffer of IntArrayList (from the fastutil library). If n nodes are used, O(n) objects are created, independent of the number of edges.

Comparison to the current class: The current dynamic graph class, SynchronizedDynamicGraph, uses a ConcurrentHashMap to store nodes, which has more overhead than an ArrayBuffer when the node ids are mostly sequential. Also, each node stores an ArrayBuffer[Int], which I believe boxes the ints into objects, creating overhead relative to fastutil's IntArrayList. It also has non-trivial synchronization overhead; in particular, when iterating over the neighbors of a node, the neighbors are first copied into a new array, and then an iterator over the new array is returned. The current class is better when the node ids are very non-sequential or when automatic synchronization is needed, but otherwise I believe the new class is more efficient. In a very informal comparison on a graph with several hundred million edges, the current class wouldn't load using 30 GB of heap, while the new class fit the graph in 7.3 GB of RAM.
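For readers skimming the thread, here is a minimal sketch of the layout described above, not the code in this PR. It assumes only the Scala standard library: `IntList` is a hypothetical stand-in for fastutil's IntArrayList (a growable list over a primitive `Array[Int]`, so no element is boxed), and `ArrayGraphSketch` and its method names are likewise made up for illustration.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for fastutil's IntArrayList: a growable list backed by
// a primitive Array[Int], so the stored ints are never boxed.
final class IntList(initialCapacity: Int = 4) {
  private var data = new Array[Int](math.max(1, initialCapacity))
  private var count = 0

  def size: Int = count

  def apply(i: Int): Int = {
    if (i < 0 || i >= count) throw new IndexOutOfBoundsException(i.toString)
    data(i)
  }

  def add(x: Int): Unit = {
    // Double the backing array when it is full (amortized O(1) appends).
    if (count == data.length) data = java.util.Arrays.copyOf(data, data.length * 2)
    data(count) = x
    count += 1
  }
}

// Sketch of an array-based dynamic directed graph that stores outbound neighbors only.
// One IntList object is created per node, independent of the number of edges.
final class ArrayGraphSketch {
  // outbound(id) holds the outbound neighbors of id, or null if id is not in the graph.
  private val outbound = new ArrayBuffer[IntList]

  def nodeExists(id: Int): Boolean =
    id >= 0 && id < outbound.length && outbound(id) != null

  def addNode(id: Int): Unit = {
    require(id >= 0, "node ids must be non-negative")
    while (outbound.length <= id) outbound += null // pad the gap with "absent" slots
    if (outbound(id) == null) outbound(id) = new IntList()
  }

  def addEdge(src: Int, dst: Int): Unit = {
    addNode(src)
    addNode(dst)
    outbound(src).add(dst)
  }

  def outDegree(id: Int): Int = if (nodeExists(id)) outbound(id).size else 0

  // i-th outbound neighbor of id; the caller is expected to check nodeExists first.
  def outNeighbor(id: Int, i: Int): Int = outbound(id)(i)
}
```

The padding loop in `addNode` is why this layout suits mostly sequential ids: a very sparse id space would leave many null slots, which is the case where a hash map is the better fit.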

@pankajgupta Could you take a look at this, or point me to someone else?

Thanks!

Array based dynamic graph

pankajgupta · 2015-01-27T05:07:56Z

cassovary-core/src/main/scala/com/twitter/cassovary/util/FastUtilUtils.scala

@@ -61,4 +61,6 @@ object FastUtilUtils {

  def newInt2IntOpenHashMap(): mutable.Map[Int, Int] =
    new Int2IntOpenHashMap().asInstanceOf[jutil.Map[Int, Int]]
+
+  def intArrayListToSeq(list: IntArrayList): Seq[Int] = list map { _.toInt }


list.toIntArray() will also work, but this is fine too.

this will be O(n) btw

You don't need to use this at all (see my previous comment)

Good point about O(n)! [edit: I'm now using the IndexedSeq wrapper you propose below, so it should be O(1)]
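To illustrate the O(n) versus O(1) point, a wrapper of roughly this shape avoids the copy. This is a sketch under assumptions, not the wrapper actually proposed in the review: `IntArrayView` is a hypothetical name, and it wraps a plain `Array[Int]` plus a logical length so that it runs with only the standard library. With fastutil, the same shape would delegate to the list's `getInt(i)` and `size()`.

```scala
// Hypothetical read-only view: exposes the first `len` entries of a primitive
// Array[Int] as an IndexedSeq[Int]. Construction is O(1) because nothing is copied.
final class IntArrayView(data: Array[Int], len: Int)
    extends scala.collection.IndexedSeq[Int] {
  require(len >= 0 && len <= data.length, "len must be within the backing array")

  def length: Int = len

  def apply(i: Int): Int = {
    if (i < 0 || i >= len) throw new IndexOutOfBoundsException(i.toString)
    data(i)
  }
}
```

Because the view shares the backing storage, later writes to the underlying array are visible through it. That is the trade-off against `list map { _.toInt }`, which pays O(n) up front but returns an independent snapshot.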

pankajgupta · 2015-01-27T06:42:51Z

Consider adding benchmark in cassovary-benchmark subproject as well to demo the performance of this.

szymonm · 2015-01-27T07:31:31Z

cassovary-core/src/main/scala/com/twitter/cassovary/graph/ArrayBasedDynamicDirectedGraph.scala

+  // outboundLists(id) contains the outbound neighbors of the given id,
+  // or null if the id is not in this graph.
+  // If we aren't storing outbound neighbors, outboundLists will always remain size 0.
+  private val outboundLists = new ArrayBuffer[IntArrayList]


You can also consider using http://fastutil.di.unimi.it/docs/it/unimi/dsi/fastutil/objects/ObjectArrayList.html instead of ArrayBuffer

Do you think it is more efficient than ArrayBuffer?

plofgren · 2015-01-28T21:44:52Z

I probably won't get to the benchmark for a few days, but I'll post once I have it, and I'll make a call then about multi-threading.

pankajgupta · 2015-01-29T18:14:16Z

ok regarding benchmark

plofgren · 2015-01-29T21:11:13Z

I needed a mutable undirected graph, so I just added code to support undirected graphs (i.e. stored direction Mutual).
@pankajgupta @szymonm Please comment on the new commit when you get a chance.

plofgren · 2015-01-29T21:11:25Z

I actually didn't realize until today that "Mutual" meant undirected and was starting to write my own UndirectedGraph wrapper class when I noticed it. Should I add a comment to StoredGraphDir along the lines of
Mutual, // In a mutual graph, the outbound and inbound neighbors of each node are equal, and space is typically saved by only storing the outbound neighbors of each node
?
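The semantics in that proposed comment can be sketched as follows. This is an illustration only, not Cassovary's implementation: `MutualGraphSketch` and its method names are hypothetical, every id up to the largest one seen is treated as an existing node, and the neighbor lists use a boxed `ArrayBuffer[Int]` purely for brevity.

```scala
import scala.collection.mutable.ArrayBuffer

// Sketch of "Mutual" storage: one neighbor list per node, and the inbound
// neighbors are read from that same list instead of being stored separately.
final class MutualGraphSketch {
  private val neighbors = new ArrayBuffer[ArrayBuffer[Int]]

  private def ensureNode(id: Int): Unit = {
    require(id >= 0, "node ids must be non-negative")
    while (neighbors.length <= id) neighbors += new ArrayBuffer[Int]
  }

  // An undirected edge {u, v} is recorded in both endpoint lists.
  def addEdge(u: Int, v: Int): Unit = {
    ensureNode(u)
    ensureNode(v)
    neighbors(u) += v
    if (u != v) neighbors(v) += u // record a self-loop only once
  }

  def outboundNodes(id: Int): scala.collection.Seq[Int] =
    if (id >= 0 && id < neighbors.length) neighbors(id) else Nil

  // In a mutual graph the inbound neighbors equal the outbound neighbors.
  def inboundNodes(id: Int): scala.collection.Seq[Int] = outboundNodes(id)
}
```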

pankajgupta · 2015-01-29T21:41:31Z

LGTM. Yes, that is the intent of Mutual, but it would be better to have a barebones UndirectedGraphWrapper class for readability and to avoid the surprise you ran into.
So let's close this PR; please feel free to take that up in a new PR.

A single threaded dynamic graph implementation that keeps nodes and each node's adjacencies in native arrays.

plofgren and others added 4 commits January 24, 2015 19:48

New Dynamic Graph based on IntArrayList (fastutil).

2731265

Adding ArrayBasedDynamicDirectedGraph to GraphReader.

2de7e96

Responding to Santiago's comments at 2731265#commitcomment-9433152

06a2b73

Merge pull request #1 from plofgren/array-based-dynamic-graph

1ee4c8a

Array based dynamic graph

pankajgupta reviewed Jan 27, 2015

szymonm reviewed Jan 27, 2015

plofgren added 2 commits January 27, 2015 19:03

Responding to Pankaj and Szymon's comments

3d78754

Responding to Pankaj and Szymon's comments

e4f38ba

Responding to Pankaj and Szymon's comments

ab0fef0

Added support for undirected graphs

7e32ac9

pankajgupta added a commit that referenced this pull request Jan 29, 2015

Merge pull request #135 from plofgren/master

595d3e7

A single threaded dynamic graph implementation that keeps nodes and each node's adjacencies in native arrays.

pankajgupta merged commit 595d3e7 into twitter:master Jan 29, 2015
